BitCube: A Bottom-Up Cubing Engineering
Authors
Abstract
Improving online analytical processing (OLAP) through efficient cube computation plays a key role in data warehouse management. Hashing, grouping, and mining techniques are commonly used to improve cube pre-computation. This work presents BitCube, a fast cubing method that uses bitmaps as inverted indexes for grouping. It horizontally partitions the data according to the values of one dimension and, for each resulting fragment, performs grouping following bottom-up criteria. BitCube also supports partial materialization based on iceberg conditions, to handle large datasets for which a full cube pre-computation is too expensive. The space requirement of the bitmaps is reduced by applying an adaptation of the WAH compression technique. Experimental analysis on both synthetic and real datasets shows that BitCube outperforms previous algorithms for full cube computation and achieves comparable results on iceberg cubing.
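The grouping idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes a COUNT measure, uses plain Python integers as uncompressed bitmaps (no WAH compression), and the names `build_bitmaps`, `iceberg_cube`, and `min_support` are hypothetical. Each dimension value maps to a bitmap of the rows that contain it; a group-by cell is refined bottom-up by intersecting bitmaps, and a branch is dropped as soon as its count falls below the iceberg threshold.

```python
def build_bitmaps(rows, num_dims):
    """Inverted index: for each dimension, value -> int bitmap of row ids."""
    bitmaps = [dict() for _ in range(num_dims)]
    for i, row in enumerate(rows):
        for d in range(num_dims):
            # Set bit i in the bitmap of the value found in dimension d.
            bitmaps[d][row[d]] = bitmaps[d].get(row[d], 0) | (1 << i)
    return bitmaps


def popcount(mask):
    """Number of set bits, i.e. number of rows in the group."""
    return bin(mask).count("1")


def iceberg_cube(rows, num_dims, min_support=1):
    """Return {cell: count} for every cell with count >= min_support.

    A cell is a tuple with one entry per dimension; "*" marks an
    aggregated dimension. min_support is assumed to be at least 1.
    """
    bitmaps = build_bitmaps(rows, num_dims)
    cube = {}

    def expand(cell, mask, start_dim):
        cube[cell] = popcount(mask)
        # Refine the current cell on each remaining dimension in turn.
        for d in range(start_dim, num_dims):
            for value, bitmap in bitmaps[d].items():
                sub = mask & bitmap  # rows of this cell that also hold `value`
                if popcount(sub) >= min_support:  # iceberg pruning
                    expand(cell[:d] + (value,) + cell[d + 1:], sub, d + 1)

    if len(rows) >= min_support:
        expand(("*",) * num_dims, (1 << len(rows)) - 1, 0)
    return cube


sample_rows = [("a", "x"), ("a", "y"), ("b", "x"), ("a", "x")]
full_cube = iceberg_cube(sample_rows, 2, min_support=1)
iceberg = iceberg_cube(sample_rows, 2, min_support=2)
```

On the four sample rows, `min_support=2` keeps only the cells `("*", "*")`, `("a", "*")`, `("a", "x")`, and `("*", "x")`; the branch for `"b"` is cut before any of its refinements are examined, which is the saving that iceberg conditions are meant to provide.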
Similar resources
Star-Cubing: Computing Iceberg Cubes by Top-Down and Bottom-Up Integration
Data cube computation is one of the most essential but expensive operations in data warehousing. Previous studies have developed two major approaches, top-down and bottom-up. The former, represented by the MultiWay Array Cube (called MultiWay) algorithm [25], aggregates simultaneously on multiple dimensions; however, it cannot take advantage of Apriori pruning [2] when computing iceberg cubes (c...
Computing Complex Iceberg Cubes by Multiway Aggregation and Bounding
Iceberg cubing is a valuable technique in data warehouses. The efficiency of iceberg cube computation comes from efficient aggregation and effective pruning for constraints. In advanced applications, iceberg constraints are often non-monotone and complex, for example, “Average cost in the range [δ1, δ2] and standard deviation of cost less than β”. The current cubing algorithms either are effici...
Efficient Dynamic Indexing and Retrieval of XML Documents using Three-Dimensional Quasi-BitCube
XML is a new standard for exchanging and representing data on the Internet. Techniques for indexing and retrieval of XML data are drawing increasing attention since they enable one to access certain parts of retrieved documents easily. However, they provide little or no support for adding new documents to an existing document collection, requiring instead that the entire collection be re-indexed...
BitCube: Clustering and Statistical Analysis for XML Documents
In this paper, we describe a new bitmap indexing technique to cluster XML documents. XML is a new standard for exchanging and representing information on the Internet. Documents can be hierarchically represented by XML-elements. XML documents are represented and indexed using a bitmap indexing technique. We define the similarity and popularity operations available in bitmap indexes and propose ...
Multiway Iceberg Cubing on Trees
The Star-cubing algorithm performs multiway aggregation on trees but incurs huge memory consumption. We propose a new algorithm MG-cubing that achieves maximal multiway aggregation. Our experiments show that MG-cubing achieves similar and very often better time and memory efficiency than Star-cubing.
Publication date: 2009